feature/#397 scenario duplication #2373
base: develop
Conversation
force-pushed from 658f02f to 2644479
☂️ Python Coverage
Overall Coverage — New Files: no new covered files — Modified Files: (coverage table omitted)
force-pushed from 2644479 to f79d591
…an_duplicate functions
force-pushed from fe1826e to 5b53c5f
self.__logger.error(f"Error uploading `{up_path.name}` to data " | ||
f"node `{self.id}`:") # type: ignore[attr-defined] | ||
self.__logger.error(f"Error uploading `{up_path.name}` to data " f"node `{self.id}`:") # type: ignore[attr-defined] |
Please check your automatic formatter as this line is over 120 chars. Please revert the change.
I believe there are still a few areas in the code that can be improved.
taipy/core/data/excel.py (Outdated)
def _duplicate_data(self):
    new_data_path = self._duplicate_data_file(self.id)
    if hasattr(self._properties, "_entity_owner"):
        del self._properties._entity_owner
    self._properties[self._PATH_KEY] = new_data_path
    return new_data_path
Is this repetitive? Can we put this in the parent `_FileDataNodeMixin` class and override it when necessary?
Hmmm, I think it's a yes-and-no answer. The `self._properties` attribute doesn't exist in `_FileDataNodeMixin`, but since ExcelDN, etc. implement `_FileDataNodeMixin`, they can access it. But the syntax feels off to me, and since `_FileDataNodeMixin` is a Mixin, I'm not comfortable putting `self._properties` in it. (I understand that we do have similar things for the `def path(self, value)` setter in `_FileDataNodeMixin`, but I'm still not sure it's the best way to do it.)
I understand your point. However, the other methods in the Mixin class don't seem to fit well either.
I think we did discuss this once. @jrobinAV, let us know what you think about this.
Yes, this is debatable. I would personally do the same as the path setter for consistency.
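For what it's worth, a minimal sketch of what hoisting this into `_FileDataNodeMixin` could look like, mirroring the existing `path` setter. It assumes the concrete data node class supplies `_properties`, `id`, and `_duplicate_data_file`; the mixin itself does not define them, which is the same implicit contract the path setter already relies on:

    class _FileDataNodeMixin:
        _PATH_KEY = "path"

        def _duplicate_data(self) -> str:
            # Sketch only: _properties, id, and _duplicate_data_file() are
            # expected from the concrete class (ExcelDataNode, CSVDataNode, ...).
            new_data_path = self._duplicate_data_file(self.id)  # type: ignore[attr-defined]
            if hasattr(self._properties, "_entity_owner"):
                del self._properties._entity_owner
            self._properties[self._PATH_KEY] = new_data_path
            return new_data_path

Subclasses with special needs (e.g. Excel sheet handling) could still override it.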
if base_name.startswith(self.__TAIPY_CLONED_PREFIX):
    base_name = "".join(base_name.split("_")[5:])
new_base_path = os.path.join(folder_path, f"{self.__TAIPY_CLONED_PREFIX}_{id}_{base_name}")
I have two questions:
What would be the new file name if a data node is duplicated twice?
Similarly, what happens if a duplicated data node is duplicated (dn -> duplicated_dn -> duplicated duplicated_dn)?
Yep, so take `example.csv` as the file name: after you first duplicate the dn, it will create a new file `TAIPY_CLONED_new_dn_id_example.csv`. But if you then duplicate this newly duplicated dn, another file will be created with the name pattern `TAIPY_CLONED_another_new_dn_id_example.csv`. The `TAIPY_CLONED` prefix won't be repeated.
I see. So what about duplicating a data node twice?
Then we will have `TAIPY_CLONED_dn_duplicated_id_1_example.csv` for the 1st dn, and `TAIPY_CLONED_dn_duplicated_id_2_example.csv` for the 2nd dn.
I believe having both the `TAIPY_CLONED` prefix and the unique ID in the name is not necessary. If we can make the file name unique with the new ID, what is the purpose of keeping the prefix? We can probably propose something slightly better by differentiating the case where the file name is generated by Taipy from the case where it is provided by the user.
If the file of the initial dn is generated (`dn.is_generated`), we can simply generate a new name for the duplicate with the same function (`dn._build_path`). Otherwise, we can use your proposal without the prefix: `{new_id}_{base_name}`.
What do you think?
Well, I think the prefix makes the naming clear about what the file is. Just the id is fine, but then we would have to state clearly in the doc that this is what we generate, whereas users can understand it from the prefix alone, without the doc.
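For illustration, a rough sketch of the proposal above. The branch on `dn.is_generated` and the use of `dn._build_path` come from this thread; the helper name, the `_build_path` signature, and the `storage_type` call are assumptions:

    import os

    def _build_duplicate_path(dn, new_id: str) -> str:
        # Hypothetical helper, not the PR's implementation.
        if dn.is_generated:
            # Taipy generated the original file name, so generate a fresh
            # one for the duplicate the same way (signature assumed).
            return str(dn._build_path(dn.storage_type()))
        # User-provided file name: make it unique with the new id, no prefix.
        folder, base_name = os.path.split(dn.path)
        return os.path.join(folder, f"{new_id}_{base_name}")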
I believe I know what's bugging me. It's a matter of responsibility.
In short, I believe the data duplication should not be the responsibility of the data node.
In my mind, the data node class only knows itself and does not have a global view of all the data nodes or scenarios. Duplicating data should be the responsibility of classes that do have an overview of the situation: either the managers or new classes dedicated to data duplication.
Give me some time to propose something else.
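To illustrate that split of responsibility, a minimal sketch where a dedicated class, not the data node, decides how the file gets copied and named (the class and method names are hypothetical):

    import shutil
    from pathlib import Path

    class _DataDuplicator:
        """Hypothetical orchestrator: it has the global view, so file-copy
        and naming decisions live here rather than in the data node."""

        def duplicate_data_file(self, dn_path: str, new_dn_id: str) -> str:
            src = Path(dn_path)
            dst = src.with_name(f"{new_dn_id}_{src.name}")
            shutil.copy(src, dst)  # the duplicator copies, the dn stays passive
            return str(dst)

A manager duplicating a scenario could then call this for each file-based data node, leaving the data node class unaware of other entities.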
    reasons._add_reason(self.id, DataNodeEditInProgress(self.id))  # type: ignore[attr-defined]
    return reasons
…
    up_path = pathlib.Path(path)
    try:
        upload_data = self._read_from_path(str(up_path))
    except Exception as err:
-       self.__logger.error(f"Error uploading `{up_path.name}` to data "
-                           f"node `{self.id}`:")  # type: ignore[attr-defined]
+       self.__logger.error(f"Error uploading `{up_path.name}` to data " f"node `{self.id}`:")  # type: ignore[attr-defined]
self.__logger.error(f"Error uploading `{up_path.name}` to data " f"node `{self.id}`:") # type: ignore[attr-defined] | |
self.__logger.error(f"Error uploading `{up_path.name}` to data " | |
f"node `{self.id}`:") # type: ignore[attr-defined] |
What type of PR is this? (check all applicable)
Description
Related Tickets & Documents
#397